N=1:

rank | frequency | n-gram |
---|---|---|
1 | 17721 | -а |
2 | 12078 | -и |
3 | 8700 | -е |
4 | 8159 | -о |
5 | 6170 | -т |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 6162 | -та |
2 | 5427 | -те |
3 | 3650 | -ни |
4 | 3538 | -то |
5 | 3489 | -ия |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 5046 | -ите |
2 | 4139 | -ата |
3 | 1787 | -ото |
4 | 1521 | -ето |
5 | 1285 | -ния |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 1673 | -ните |
2 | 1502 | -ната |
3 | 1119 | -ката |
4 | 850 | -нето |
5 | 806 | -ията |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 725 | -ането |
2 | 651 | -ската |
3 | 421 | -ските |
4 | 320 | -нието |
5 | 307 | -ското |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The analysis runs parallel to 2.2.5 Most frequent word beginnings, but the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos := (@pos + 1) AS pos, xx.*
FROM (SELECT @pos := 0) r,
     (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word, 3)) AS ngram
      FROM words
      WHERE w_id > 100
      GROUP BY RIGHT(word, 3)
      ORDER BY cnt DESC) xx
LIMIT 5;
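
The same counting can be sketched outside the database for any N. The following Python is a minimal illustration, not the code used to produce the tables: the function name `top_endings` and the sample word list are hypothetical, and ties are broken alphabetically, which the SQL query does not specify. Each list entry is counted once, as `COUNT(*)` counts one row per entry in `words`.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent n-letter word endings as (rank, count, ending).

    Assumes n >= 1. Words shorter than n letters are counted whole,
    which is also what RIGHT(word, n) returns in MySQL.
    """
    counts = Counter(word[-n:] for word in words)
    # Sort by descending count; break ties alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, cnt, "-" + ending)
        for rank, (ending, cnt) in enumerate(ranked[:k], start=1)
    ]


# Hypothetical sample word list, for illustration only.
sample = ["книгата", "масата", "жената", "децата", "селото", "момчето", "хората"]

for row in top_endings(sample, 3):
    print(row)
```

Calling `top_endings(words, n)` with n from 1 to 5 on the full word list would give one ranking per table; the `w_id > 100` filter of the query has no counterpart here and would have to be applied to the list beforehand.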
2.2.5 Most frequent word beginnings